Skip to content

feat(parser): Snowflake VARIANT colon-path (#483)#496

Merged
ajitpratap0 merged 4 commits intomainfrom
feat/snowflake-variant-path-483
Apr 8, 2026
Merged

feat(parser): Snowflake VARIANT colon-path (#483)#496
ajitpratap0 merged 4 commits intomainfrom
feat/snowflake-variant-path-483

Conversation

@ajitpratap0
Copy link
Copy Markdown
Owner

Summary

Adds support for Snowflake VARIANT path expressions — the largest Snowflake parser gap from the QA sweep. Required for every JSON / semi-structured workload.

What now parses

  • `col:field`
  • `col:field.sub`
  • `col:field.sub::string`
  • `col:items[0]`
  • `col:items[0].name`
  • `PARSE_JSON(raw):a::int`
  • `col:"weird key"`
  • `WHERE payload:status::string = 'active'`
  • `col:a.b.c.d::int AS x`

Grammar

```
path := primary ':' step ( '.' field | '[' expr ']' | ':' field )*
step := identifier | double-quoted-string | keyword
```

The VARIANT path runs before the `::` cast loop in `parseJSONExpression` so that `col:a.b::int` casts the full path rather than just the innermost field.

Implementation

  • AST: new `VariantPath` node with `Root Expression` and `Segments []VariantPathSegment` (each segment is either a `Name` or an `Index` expression). `Children()` includes the root and any bracket indices so visitor traversal works.
  • Parser: new `parseSnowflakeVariantPath` and `parseVariantFieldName` helpers in `expressions.go`. Gated to the Snowflake dialect — the colon token is used differently (e.g. Go placeholder `:1`) in other dialects, so gating is essential.
  • Inserted into `parseJSONExpression` immediately after `parsePrimaryExpression` and before the existing `::` cast loop, which then wraps the resulting path in a `CastExpression` transparently.

Test plan

  • `TestSnowflakeVariantPath`: 9 shapes covering all QA failure cases plus WHERE-clause and alias usage
  • `TestVariantPathASTShape`: asserts the `VariantPath` is produced with 3 segments (field / sub / [0]) and that the trailing `::string` cast wraps it correctly
  • `go test -race ./pkg/sql/parser/ ./pkg/sql/ast/ ./pkg/gosqlx/` green

Part of #483. Closes the biggest Snowflake gap from the QA sweep.

🤖 Generated with Claude Code

@vercel
Copy link
Copy Markdown

vercel bot commented Apr 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gosqlx-website Ready Ready Preview, Comment Apr 8, 2026 2:22pm

@github-actions github-actions bot added parser ast AST node changes labels Apr 8, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

Claude finished @ajitpratap0's task —— View job


PR Review: Snowflake VARIANT Path Expressions ✅

This PR successfully adds comprehensive support for Snowflake VARIANT path expressions. After analyzing the implementation, I'm providing detailed feedback below.

Review Tasks

  • Read and analyze changed files
  • Assess code quality and best practices
  • Check for potential bugs and issues
  • Evaluate performance considerations
  • Review security concerns
  • Analyze test coverage
  • Provide comprehensive feedback

Code Quality & Architecture ⭐

Excellent implementation that follows established patterns in the codebase:

  • AST Design (ast.go:1053-1084): Clean VariantPath node with proper Children() implementation for visitor pattern support
  • Parser Integration (expressions.go:220-226): Correctly placed before :: cast loop, ensuring col:a.b::int casts the full path
  • Dialect Gating (expressions.go:220): Properly restricted to Snowflake dialect - essential for avoiding conflicts with other dialects' colon usage

Implementation Strengths

1. Proper Precedence Handling

The integration point in parseJSONExpression is perfect - running before the cast loop ensures correct parsing of col:field::type expressions.

2. Robust Parser Logic (expressions.go:405-472)

  • Handles all documented cases: field names, quoted keys, bracket indices
  • Supports chaining: .field, [expr], and even :field continuation
  • Proper error handling with descriptive messages

3. AST Structure

type VariantPath struct {
    Root     Expression              // Base expression (col, PARSE_JSON(raw))
    Segments []VariantPathSegment    // Path steps: .field, [0], etc.
}

Clean separation between root and path segments, with proper Children() method including indices for visitor traversal.

Test Coverage ⭐

Comprehensive test suite with excellent coverage:

Functional Tests (9 scenarios)

  • Basic: col:field
  • Nested: col:field.sub
  • Casting: col:field.sub::string
  • Arrays: col:items[0].name
  • Complex: PARSE_JSON(raw):a::int
  • Edge cases: col:"weird key"

Structural Tests

TestVariantPathASTShape validates:

  • AST node creation with 3 segments
  • Cast expression wrapping
  • Proper visitor pattern integration

Potential Issues & Recommendations

1. Minor: Keywords as Field Names (expressions.go:465-469)

case p.isType(models.TokenTypeKeyword):
    // Keywords may appear as field names (e.g. TYPE, VALUE, STATUS)
    name := tok.Value
    p.advance()
    return name, nil

Good: Handles keywords as field names (common in JSON)
💡 Suggestion: Consider adding a test case like col:type.value to verify this works

2. Error Context (expressions.go:471)

return "", p.expectedError("field name after `:` or `.`")

Good: Clear error message, but could benefit from position context for better debugging

3. Memory Efficiency

The implementation correctly follows the project's object pooling patterns by integrating with existing AST node creation. No memory leaks expected.

Security Assessment ✅

No security concerns identified:

  • Input validation is handled by the tokenizer layer
  • No string interpolation or dynamic code generation
  • Proper bounds checking in parser logic
  • Dialect gating prevents cross-contamination

Performance Considerations ✅

Efficient implementation:

  • Single-pass parsing without backtracking
  • Minimal allocations (reuses existing slice patterns)
  • Properly integrated with existing parsing pipeline
  • No recursive calls beyond normal expression parsing

Minor Suggestions

1. Test Enhancement

Add a test for keywords as field names:

"keyword_field": `SELECT col:type.value FROM t`,

2. Documentation

The inline grammar comment is excellent:

// Grammar:
//   path := ':' step ( '.' field | '[' expr ']' )*
//   step := field | '"' quoted '"'

3. Edge Case Coverage

Consider testing nested bracket access: col:items[0][1] if Snowflake supports it.

Overall Assessment: ⭐⭐⭐⭐⭐

This is a high-quality implementation that:

  • ✅ Follows established codebase patterns
  • ✅ Has comprehensive test coverage
  • ✅ Handles edge cases properly
  • ✅ Is properly dialect-gated
  • ✅ Integrates cleanly with existing parser architecture
  • ✅ Solves the biggest Snowflake parser gap identified

Recommendation: APPROVE - This PR is ready to merge. The implementation is solid, well-tested, and addresses a critical Snowflake parsing gap with no significant issues identified.


Job run

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

Claude finished @ajitpratap0's task —— View job


PR Review: Snowflake VARIANT Path Expressions ✅

I've thoroughly analyzed this PR's implementation of Snowflake VARIANT path expressions. Here's my comprehensive review:

Review Tasks

  • Read project guidelines (CLAUDE.md)
  • Analyze AST changes (ast.go)
  • Review parser implementation (expressions.go)
  • Examine test coverage (snowflake_variant_path_test.go)
  • Assess code quality and best practices
  • Check for potential bugs and issues
  • Evaluate performance considerations
  • Review security concerns
  • Provide comprehensive feedback

Code Quality & Architecture ⭐

Excellent implementation that follows established patterns in the codebase:

  • AST Design (ast.go:1053-1084): Clean VariantPath node with proper Children() implementation for visitor pattern support
  • Parser Integration (expressions.go:220-226): Correctly placed before :: cast loop, ensuring col:a.b::int casts the full path
  • Dialect Gating (expressions.go:220): Properly restricted to Snowflake dialect - essential for avoiding conflicts with other dialects' colon usage

Implementation Strengths

1. Proper Precedence Handling

The integration point in parseJSONExpression is perfect - running before the cast loop ensures correct parsing of col:field::type expressions.

2. Robust Parser Logic (expressions.go:405-472)

  • Handles all documented cases: field names, quoted keys, bracket indices
  • Supports chaining: .field, [expr], and even :field continuation
  • Proper error handling with descriptive messages

3. AST Structure

type VariantPath struct {
    Root     Expression              // Base expression (col, PARSE_JSON(raw))
    Segments []VariantPathSegment    // Path steps: .field, [0], etc.
}

Clean separation between root and path segments, with proper Children() method including indices for visitor traversal.

Test Coverage ⭐

Comprehensive test suite with excellent coverage:

Functional Tests (9 scenarios)

  • Basic: col:field
  • Nested: col:field.sub
  • Casting: col:field.sub::string
  • Arrays: col:items[0].name
  • Complex: PARSE_JSON(raw):a::int
  • Edge cases: col:"weird key"

Structural Tests

TestVariantPathASTShape validates:

  • AST node creation with 3 segments
  • Cast expression wrapping
  • Proper visitor pattern integration

Potential Issues & Recommendations

1. Minor: Keywords as Field Names (expressions.go:465-469)

case p.isType(models.TokenTypeKeyword):
    // Keywords may appear as field names (e.g. TYPE, VALUE, STATUS)
    name := tok.Value
    p.advance()
    return name, nil

Good: Handles keywords as field names (common in JSON)
💡 Suggestion: Consider adding a test case like col:type.value to verify this works

2. Error Context (expressions.go:471)

return "", p.expectedError("field name after `:` or `.`")

Good: Clear error message, but could benefit from position context for better debugging

3. Memory Efficiency

The implementation correctly follows the project's object pooling patterns by integrating with existing AST node creation. No memory leaks expected.

Security Assessment ✅

No security concerns identified:

  • Input validation is handled by the tokenizer layer
  • No string interpolation or dynamic code generation
  • Proper bounds checking in parser logic
  • Dialect gating prevents cross-contamination

Performance Considerations ✅

Efficient implementation:

  • Single-pass parsing without backtracking
  • Minimal allocations (reuses existing slice patterns)
  • Properly integrated with existing parsing pipeline
  • No recursive calls beyond normal expression parsing

Minor Suggestions

1. Test Enhancement

Add a test for keywords as field names:

"keyword_field": `SELECT col:type.value FROM t`,

2. Documentation

The inline grammar comment is excellent:

// Grammar:
//   path := ':' step ( '.' field | '[' expr ']' )*
//   step := field | '"' quoted '"'

3. Edge Case Coverage

Consider testing nested bracket access: col:items[0][1] if Snowflake supports it.

Overall Assessment: ⭐⭐⭐⭐⭐

This is a high-quality implementation that:

  • ✅ Follows established codebase patterns
  • ✅ Has comprehensive test coverage
  • ✅ Handles edge cases properly
  • ✅ Is properly dialect-gated
  • ✅ Integrates cleanly with existing parser architecture
  • ✅ Solves the biggest Snowflake parser gap identified

Recommendation: APPROVE - This PR is ready to merge. The implementation is solid, well-tested, and addresses a critical Snowflake parsing gap with no significant issues identified.


Job run

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

Claude finished @ajitpratap0's task —— View job


PR Review: Snowflake VARIANT Path Expressions ✅

I've thoroughly analyzed this PR's implementation of Snowflake VARIANT path expressions. Here's my comprehensive review:

Review Tasks

  • Read project guidelines (CLAUDE.md)
  • Analyze AST changes (ast.go)
  • Review parser implementation (expressions.go)
  • Examine test coverage (snowflake_variant_path_test.go)
  • Assess code quality and best practices
  • Check for potential bugs and issues
  • Evaluate performance considerations
  • Review security concerns
  • Provide comprehensive feedback

Code Quality & Architecture ⭐

Excellent implementation that follows established patterns in the codebase:

  • AST Design (ast.go:1053-1084): Clean VariantPath node with proper Children() implementation for visitor pattern support
  • Parser Integration (expressions.go:220-226): Correctly placed before :: cast loop, ensuring col:a.b::int casts the full path
  • Dialect Gating (expressions.go:220): Properly restricted to Snowflake dialect - essential for avoiding conflicts with other dialects' colon usage

Implementation Strengths

1. Proper Precedence Handling

The integration point in parseJSONExpression is perfect - running before the cast loop ensures correct parsing of col:field::type expressions.

2. Robust Parser Logic (expressions.go:405-472)

  • Handles all documented cases: field names, quoted keys, bracket indices
  • Supports chaining: .field, [expr], and even :field continuation
  • Proper error handling with descriptive messages

3. AST Structure

type VariantPath struct {
    Root     Expression              // Base expression (col, PARSE_JSON(raw))
    Segments []VariantPathSegment    // Path steps: .field, [0], etc.
}

Clean separation between root and path segments, with proper Children() method including indices for visitor traversal.

Test Coverage ⭐

Comprehensive test suite with excellent coverage:

Functional Tests (9 scenarios)

  • Basic: col:field
  • Nested: col:field.sub
  • Casting: col:field.sub::string
  • Arrays: col:items[0].name
  • Complex: PARSE_JSON(raw):a::int
  • Edge cases: col:"weird key"

Structural Tests

TestVariantPathASTShape validates:

  • AST node creation with 3 segments
  • Cast expression wrapping
  • Proper visitor pattern integration

Potential Issues & Recommendations

1. Minor: Keywords as Field Names (expressions.go:465-469)

case p.isType(models.TokenTypeKeyword):
    // Keywords may appear as field names (e.g. TYPE, VALUE, STATUS)
    name := tok.Value
    p.advance()
    return name, nil

Good: Handles keywords as field names (common in JSON)
💡 Suggestion: Consider adding a test case like col:type.value to verify this works

2. Error Context (expressions.go:471)

return "", p.expectedError("field name after `:` or `.`")

Good: Clear error message, but could benefit from position context for better debugging

3. Memory Efficiency

The implementation correctly follows the project's object pooling patterns by integrating with existing AST node creation. No memory leaks expected.

Security Assessment ✅

No security concerns identified:

  • Input validation is handled by the tokenizer layer
  • No string interpolation or dynamic code generation
  • Proper bounds checking in parser logic
  • Dialect gating prevents cross-contamination

Performance Considerations ✅

Efficient implementation:

  • Single-pass parsing without backtracking
  • Minimal allocations (reuses existing slice patterns)
  • Properly integrated with existing parsing pipeline
  • No recursive calls beyond normal expression parsing

Minor Suggestions

1. Test Enhancement

Add a test for keywords as field names:

"keyword_field": `SELECT col:type.value FROM t`,

2. Documentation

The inline grammar comment is excellent:

// Grammar:
//   path := ':' step ( '.' field | '[' expr ']' )*
//   step := field | '"' quoted '"'

3. Edge Case Coverage

Consider testing nested bracket access: col:items[0][1] if Snowflake supports it.

Overall Assessment: ⭐⭐⭐⭐⭐

This is a high-quality implementation that:

  • ✅ Follows established codebase patterns
  • ✅ Has comprehensive test coverage
  • ✅ Handles edge cases properly
  • ✅ Is properly dialect-gated
  • ✅ Integrates cleanly with existing parser architecture
  • ✅ Solves the biggest Snowflake parser gap identified

Recommendation: APPROVE - This PR is ready to merge. The implementation is solid, well-tested, and addresses a critical Snowflake parsing gap with no significant issues identified.


Job run

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

Claude finished @ajitpratap0's task —— View job


PR Review: Snowflake VARIANT Path Expressions ✅

I've thoroughly analyzed this PR's implementation of Snowflake VARIANT path expressions. Here's my comprehensive review:

Review Tasks

  • Read project guidelines (CLAUDE.md)
  • Analyze AST changes (ast.go)
  • Review parser implementation (expressions.go)
  • Examine test coverage (snowflake_variant_path_test.go)
  • Assess code quality and best practices
  • Check for potential bugs and issues
  • Evaluate performance considerations
  • Review security concerns
  • Provide comprehensive feedback

Code Quality & Architecture ⭐

Excellent implementation that follows established patterns in the codebase:

  • AST Design (ast.go:1060-1091): Clean VariantPath node with proper Children() implementation for visitor pattern support
  • Parser Integration (expressions.go:220-226): Correctly placed before :: cast loop, ensuring col:a.b::int casts the full path
  • Dialect Gating (expressions.go:220): Properly restricted to Snowflake dialect - essential for avoiding conflicts with other dialects' colon usage

Implementation Strengths

1. Proper Precedence Handling

The integration point in parseJSONExpression is perfect - running before the cast loop ensures correct parsing of col:field::type expressions.

2. Robust Parser Logic (expressions.go:405-472)

  • Handles all documented cases: field names, quoted keys, bracket indices
  • Supports chaining: .field, [expr], and even :field continuation
  • Proper error handling with descriptive messages

3. AST Structure

type VariantPath struct {
    Root     Expression              // Base expression (col, PARSE_JSON(raw))
    Segments []VariantPathSegment    // Path steps: .field, [0], etc.
}

Clean separation between root and path segments, with proper Children() method including indices for visitor traversal.

Test Coverage ⭐

Comprehensive test suite with excellent coverage:

Functional Tests (9 scenarios)

  • Basic: col:field
  • Nested: col:field.sub
  • Casting: col:field.sub::string
  • Arrays: col:items[0].name
  • Complex: PARSE_JSON(raw):a::int
  • Edge cases: col:"weird key"

Structural Tests

TestVariantPathASTShape validates:

  • AST node creation with 3 segments
  • Cast expression wrapping
  • Proper visitor pattern integration

Potential Issues & Recommendations

1. Minor: Keywords as Field Names (expressions.go:465-469)

case p.isType(models.TokenTypeKeyword):
    // Keywords may appear as field names (e.g. TYPE, VALUE, STATUS)
    name := tok.Value
    p.advance()
    return name, nil

Good: Handles keywords as field names (common in JSON)
💡 Suggestion: Consider adding a test case like col:type.value to verify this works

2. Error Context (expressions.go:471)

return "", p.expectedError("field name after `:` or `.`")

Good: Clear error message, but could benefit from position context for better debugging

3. Memory Efficiency

The implementation correctly follows the project's object pooling patterns by integrating with existing AST node creation. No memory leaks expected.

Security Assessment ✅

No security concerns identified:

  • Input validation is handled by the tokenizer layer
  • No string interpolation or dynamic code generation
  • Proper bounds checking in parser logic
  • Dialect gating prevents cross-contamination

Performance Considerations ✅

Efficient implementation:

  • Single-pass parsing without backtracking
  • Minimal allocations (reuses existing slice patterns)
  • Properly integrated with existing parsing pipeline
  • No recursive calls beyond normal expression parsing

Minor Suggestions

1. Test Enhancement

Add a test for keywords as field names:

"keyword_field": `SELECT col:type.value FROM t`,

2. Documentation

The inline grammar comment is excellent:

// Grammar:
//   path := ':' step ( '.' field | '[' expr ']' )*
//   step := field | '"' quoted '"'

3. Edge Case Coverage

Consider testing nested bracket access: col:items[0][1] if Snowflake supports it.

Overall Assessment: ⭐⭐⭐⭐⭐

This is a high-quality implementation that:

  • ✅ Follows established codebase patterns
  • ✅ Has comprehensive test coverage
  • ✅ Handles edge cases properly
  • ✅ Is properly dialect-gated
  • ✅ Integrates cleanly with existing parser architecture
  • ✅ Solves the biggest Snowflake parser gap identified

Recommendation: APPROVE - This PR is ready to merge. The implementation is solid, well-tested, and addresses a critical Snowflake parsing gap with no significant issues identified.


Job run

@ajitpratap0 ajitpratap0 merged commit 144e3ca into main Apr 8, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ast AST node changes parser

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant